Stereotypes, bias, and discrimination have been documented in Machine Learning (ML) methods such as Computer Vision (CV) [18, 80], Natural Language Processing (NLP) [6], or both, as in large image-and-caption models such as OpenAI CLIP [14]. In this paper, we evaluate how ML bias manifests in robots that physically and autonomously act within the world. We audit one of several recently published CLIP-powered robotic manipulation methods, presenting it with objects that have pictures of human faces on their surfaces, varying across race and gender, alongside task instructions containing terms associated with common stereotypes. Our experiments definitively show robots acting out toxic stereotypes with respect to gender, race, and scientifically discredited physiognomy, at scale. Furthermore, the audited methods are less likely to recognize Women and People of Color. Our interdisciplinary sociotechnical analysis synthesizes across fields and applications such as Science, Technology and Society (STS), Critical Studies, History, Safety, Robotics, and AI. We find that robots powered by large datasets and dissolution models (sometimes called "foundation models", e.g. CLIP) that contain humans risk physically amplifying malignant stereotypes in general; and that merely correcting disparities will be insufficient for the complexity and scale of the problem. Instead, we recommend that robot learning methods that physically manifest stereotypes or other harmful outcomes be paused, reworked, or even wound down when appropriate, until outcomes can be proven safe, effective, and just. Finally, we discuss comprehensive policy changes, as well as new interdisciplinary research on topics like Identity Safety Assessment Frameworks and Design Justice, to better understand and address these harms.
Understanding deep learning model behavior is critical to accepting machine learning-based decision support systems in the medical community. Previous research has shown that jointly using clinical notes with electronic health record (EHR) data improved predictive performance for patient monitoring in the intensive care unit (ICU). In this work, we explore the underlying reasons for these improvements. While relying on a basic attention-based model to allow for interpretability, we first confirm that performance significantly improves over state-of-the-art EHR data models when combining EHR data and clinical notes. We then provide an analysis showing improvements arise almost exclusively from a subset of notes containing broader context on patient state rather than clinician notes. We believe such findings highlight deep learning models for EHR data to be more limited by partially-descriptive data than by modeling choice, motivating a more data-centric approach in the field.
In recent years, the development of accurate deep keyword spotting (KWS) models has resulted in KWS technology being embedded in a number of technologies such as voice assistants. Many of these models rely on large amounts of labelled data to achieve good performance. As a result, their use is restricted to applications for which a large labelled speech data set can be obtained. Self-supervised learning seeks to mitigate the need for large labelled data sets by leveraging unlabelled data, which is easier to obtain in large amounts. However, most self-supervised methods have only been investigated for very large models, whereas KWS models are desired to be small. In this paper, we investigate the use of self-supervised pretraining for the smaller KWS models in a label-deficient scenario. We pretrain the Keyword Transformer model using the self-supervised framework Data2Vec and carry out experiments on a label-deficient setup of the Google Speech Commands data set. It is found that the pretrained models greatly outperform the models without pretraining, showing that Data2Vec pretraining can increase the performance of KWS models in label-deficient scenarios. The source code is made publicly available.
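The Data2Vec pretraining described above can be summarized as a masked student network regressing the representations produced by an exponential-moving-average (EMA) teacher. Below is a deliberately tiny NumPy sketch of one such step, with a toy linear "encoder" standing in for the Keyword Transformer; all shapes, the mask pattern, and the EMA rate are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8                                # toy feature dimension
W_student = rng.normal(size=(D, D))  # toy linear "encoder" weights
W_teacher = W_student.copy()         # teacher starts as a copy of the student

x = rng.normal(size=(16, D))         # a batch of 16 toy input frames
mask = np.arange(16) % 2 == 0        # mask every other time step

target = x @ W_teacher.T                    # teacher encodes the full input
x_masked = np.where(mask[:, None], 0.0, x)  # zero out masked steps
pred = x_masked @ W_student.T               # student sees the masked input

# Student objective: regress the teacher's representations at masked steps
loss = np.mean((pred[mask] - target[mask]) ** 2)

# Teacher tracks the student via an exponential moving average of weights
tau = 0.999
W_teacher = tau * W_teacher + (1.0 - tau) * W_student

print(f"masked regression loss: {loss:.3f}")
```

The point of the teacher/student asymmetry is that the target is a contextual representation rather than the raw waveform, which is what distinguishes Data2Vec from reconstruction-based pretraining.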
Training deep neural networks consumes an increasing share of the computational resources of many compute centers. Often, a brute-force approach is employed to obtain hyperparameter values. Our goals are (1) to enhance this situation by enabling second-order optimization methods for large neural networks, and (2) to survey the performance of optimizers on specific tasks in order to suggest to users which one best fits their problem. We introduce a novel second-order optimization method that requires only the effect of the Hessian on a vector, avoiding the enormous cost of explicitly setting up the Hessian of a large network. We compare the proposed second-order method with two state-of-the-art optimizers on five representative neural network problems, including regression and very deep networks from computer vision as well as variational autoencoders. For the largest setup, we efficiently parallelize the optimizers with Horovod and apply them to an 8-GPU NVIDIA P100 (DGX-1) machine.
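The key ingredient above is that the optimizer needs only Hessian-vector products (HVPs), never the full Hessian. As a minimal illustration of the idea (not the paper's implementation), an HVP can be approximated matrix-free with two gradient evaluations via central differences; the NumPy sketch below checks it on a toy quadratic whose Hessian is known exactly:

```python
import numpy as np

def hvp(grad_f, w, v, eps=1e-5):
    """Matrix-free Hessian-vector product: approximate H(w) @ v with two
    gradient evaluations, never forming the n x n Hessian explicitly."""
    return (grad_f(w + eps * v) - grad_f(w - eps * v)) / (2.0 * eps)

# Toy quadratic f(w) = 0.5 * w^T A w, whose gradient is A @ w and whose
# Hessian is exactly A, so the HVP can be checked against A @ v.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad_f = lambda w: A @ w

w = np.array([0.5, -1.0])
v = np.array([1.0, 1.0])
print(hvp(grad_f, w, v))  # close to A @ v = [4. 3.]
```

In practice, deep learning frameworks compute exact HVPs with automatic differentiation at a similar cost of roughly two gradient evaluations; the finite-difference version is only the simplest way to see why the full Hessian is never needed.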
The Journal Impact Factor (JIF) is often equated with journal quality and with the quality of peer review of papers submitted to that journal. We examined the association between the content of peer review and the JIF by analyzing 10,000 peer-review reports submitted to 1,644 medical and life-science journals. Two researchers hand-coded a random sample of 2,000 sentences. We then trained machine learning models to classify all 187,240 sentences as contributing or not contributing to each content category. We examined the association between ten groups of journals defined by JIF deciles and the content of peer review using linear mixed-effects models, adjusting for review length. JIFs ranged from 0.21 to 74.70. Peer-review length increased from the lowest JIF group (median 185 words) to the highest (387 words). The proportion of sentences allocated to the different content categories varied widely, even within JIF groups. Regarding thoroughness, sentences on "Materials and Methods" were more prevalent in the highest-JIF journals than in the lowest JIF group (difference 7.8 percentage points; 95% CI 4.9 to 10.7). The trend for "Presentation and Reporting" went in the opposite direction, with the highest-JIF journals giving less emphasis to such content (difference -8.9 percentage points; 95% CI -11.3 to -6.5). Regarding helpfulness, reviews for higher-JIF journals devoted less attention to "Suggestion and Solution" and provided fewer examples than those for lower-impact-factor journals. For the other content categories there were no, or only small, differences. In conclusion, peer review in higher-JIF journals tends to be more thorough in discussing the methods used but less helpful in suggesting solutions and providing examples. The differences were modest and the variability high, indicating that the JIF is a poor predictor of the quality of peer review of an individual manuscript.
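The analysis above regresses the content of reviews on JIF group while adjusting for review length. A simplified NumPy sketch with synthetic stand-in data is shown below, using plain least squares; the paper fits linear mixed-effects models with journal-level random effects, and every number here is made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: per-review percentage of "Materials and Methods"
# sentences, JIF decile (1-10), and review length in words.
n = 500
decile = rng.integers(1, 11, size=n)
length = rng.normal(300, 80, size=n)
# Assumed generative model: the percentage rises 0.8 points per JIF decile.
prop = 5.0 + 0.8 * decile + 0.01 * length + rng.normal(0.0, 1.0, size=n)

# Fixed-effects regression adjusting for review length (a simplification of
# the paper's mixed-effects model, which adds journal-level random effects).
X = np.column_stack([np.ones(n), decile, length])
beta, *_ = np.linalg.lstsq(X, prop, rcond=None)
print(beta)  # beta[1] recovers the per-decile effect, length-adjusted
```

Adjusting for length matters here because longer reviews mechanically contain more sentences of every category, so an unadjusted comparison would conflate review length with review content.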
The occurrence of vacuum arcs or radio frequency (rf) breakdowns is one of the most prevalent factors limiting the high-gradient performance of normal conducting rf cavities in particle accelerators. In this paper, we search for the existence of previously unrecognized features related to the incidence of rf breakdowns by applying a machine learning strategy to high-gradient cavity data from CERN's test stand for the Compact Linear Collider (CLIC). By interpreting the parameters of the learned models with explainable artificial intelligence (AI), we reverse-engineer physical properties for deriving fast, reliable, and simple rule-based models. Based on 6 months of historical data and dedicated experiments, our models show fractions of data with a high influence on the occurrence of breakdowns. Specifically, it is shown that the field emitted current following an initial breakdown is closely related to the probability of another breakdown occurring shortly thereafter. Results also indicate that the cavity pressure should be monitored with increased temporal resolution in future experiments, to further explore the vacuum activity associated with breakdowns.
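A fast, simple rule-based model of the kind described above can be as small as a single threshold on the post-breakdown field-emitted current. The NumPy sketch below derives such a one-rule model from synthetic data; the variable names, the 1.5 mA threshold, and the data itself are hypothetical stand-ins, not CERN's values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data: field-emitted current after an initial breakdown
# (mA) and whether another breakdown followed shortly. The "true" rule used
# to generate the labels is current > 1.5 mA, plus 5% label noise.
current = rng.uniform(0.0, 3.0, size=400)
followup = (current > 1.5).astype(int)
followup = followup ^ (rng.random(400) < 0.05)

def best_threshold(x, y):
    """Derive a one-rule model "predict 1 if x > t" by scanning every
    observed value as a candidate threshold and keeping the most accurate."""
    candidates = np.unique(x)
    accuracies = [np.mean((x > t).astype(int) == y) for t in candidates]
    return candidates[int(np.argmax(accuracies))]

t = best_threshold(current, followup)
print(f"learned rule: flag follow-up breakdown risk if current > {t:.2f} mA")
```

The appeal of reverse-engineering such rules from an explainable model is operational: a single threshold can run in a control room at full data rate, with no inference machinery, and remains auditable by accelerator physicists.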